HiRF: Hierarchical Random Field for Collective Activity Recognition in Videos
Authors
Abstract
This paper addresses the problem of recognizing and localizing coherent activities of a group of people, called collective activities, in video. Related work has argued the benefits of capturing long-range and higher-order dependencies among video features for robust recognition. To this end, we formulate a new deep model, called Hierarchical Random Field (HiRF). HiRF models only hierarchical dependencies between model variables. This effectively amounts to modeling higher-order temporal dependencies of video features. We specify an efficient inference procedure for HiRF that, at each step, uses linear programming to estimate the latent variables. Learning of HiRF parameters is specified within the max-margin framework. Our evaluation on the benchmark New Collective Activity and Collective Activity datasets demonstrates that HiRF yields superior recognition and localization compared to the state of the art.
Similar resources
Receptive Field Encoding Model for Dynamic Natural Vision
Introduction: Encoding models are used to predict human brain activity in response to sensory stimuli. The purpose of these models is to explain how sensory information is represented in the brain. Convolutional neural networks trained on images are capable of encoding magnetic resonance imaging data of humans viewing natural images. Considering the hemodynamic response function, these networks are ...
Modeling Collective Crowd Behaviors in Video
Crowd behavior analysis is an interdisciplinary topic. Understanding the collective crowd behaviors is one of the fundamental problems both in social science and natural science. Research of crowd behavior analysis can lead to a lot of critical applications, such as intelligent video surveillance, crowd abnormal detection, and public facility optimization. In this thesis, we study the crowd beh...
Hierarchical multi-channel hidden semi Markov graphical models for activity recognition
Recognizing human actions from a stream of unsegmented sensory observations is important for a number of applications such as surveillance and human-computer interaction. A wide range of graphical models have been proposed for these tasks, and are typically extensions of the generative hidden Markov models (HMM) or their discriminative counterpart, conditional random fields (CRF). These extensi...
Incremental learning of human activity models from videos
Learning human activity models from streaming videos should be a continuous process as new activities arrive over time. However, recent approaches for human activity recognition are usually batch methods, which assume that all the training instances are labeled and present in advance. Among such methods, the exploitation of the inter-relationship between the various objects in the scene (termed...
Human Activity Learning using Object Affordances from RGB-D Videos
Human activities comprise several sub-activities performed in a sequence and involve interactions with various objects. This makes reasoning about the object affordances a central task for activity recognition. In this work, we consider the problem of jointly labeling the object affordances and human activities from RGBD videos. We frame the problem as a Markov Random Field where the nodes repr...
Journal title:
Volume, issue:
Pages: -
Publication date: 2014